A Reversible and Reusable Morpho-Lexical Description of Romanian

Authors

  • Dan Tufiş
  • Ana-Maria Barbu
Abstract

Constructing a natural language dictionary and/or a grammar for computational use is a far-reaching project that requires substantial human and material resources. The generalisation of lexical approaches in natural language modelling gives the dictionary an essential role in every system for automatic natural language processing. More and more information that was traditionally encoded by grammar rules (syntax) now enters the dictionary. Thus, specific linguistic knowledge is encapsulated in lexical structures, while the syntax encodes only generic rules and conceptual restrictions, which are instantiated through specific lexical elements. Modern linguistic theories, such as Lexical-Functional Grammar (LFG), Categorial Grammars (CG), Head-driven Phrase Structure Grammar (HPSG) and Lexicalized Tree Adjoining Grammar (LTAG), emphasise the essential contribution of lexical specifications to defining and representing grammatical restrictions.

Similar resources

Romanian Lexical Data Bases: Inflected and Syllabic Forms Dictionaries

This paper presents two lexical databases for Romanian: RoMorphoDict, a dictionary of inflected forms, and RoSyllabiDict, a dictionary of syllabified inflected forms. Each database is available in two Unicode formats: text and XML. An entry of RoMorphoDict, in text format, contains the inflected form, its lemma, its morpho-syntactic description and the marking of the stressed vowel...

RoCo-News: A Hand Validated Journalistic Corpus of Romanian

The paper briefly describes the Ro-Co project and, in greater detail, one of its first outcomes, the RoCo-News corpus. Ro-Co is a series of corpora of Romanian, covering various registers, that are developed within the Research Institute for Artificial Intelligence of the Romanian Academy. They are planned for public release to the research community as the underlying automatic annotations are v...

Some Considerations on Climatic Geomorphology of the Romanian Territory

The main issues concerning the relation between climate and relief in the Romanian geomorphological literature follow three directions of research: the description of Pleistocene glacial and Pleistocene-Holocene periglacial landforms in the Carpathians; the direct analysis of climate's role in landform development; the integration of separate works and findings in a synthetic morphodynamic system applic...

Using a Large Set of EAGLES-compliant Morpho-syntactic Descriptors as a Tagset for Probabilistic Tagging

The paper presents one way of reconciling data sparseness with the requirement of high accuracy tagging in terms of fine-grained tagsets. For lexicon encoding, EAGLES elaborated a set of recommendations aimed at covering multilingual requirements and therefore resulted in a large number of features and possible values. Such an encoding, used for tagging purposes, would lead to very large tagset...

Adding Morpho-semantic Relations to the Romanian Wordnet

Keeping pace with the development of other wordnets, we present the challenges raised by the Romanian derivational system and our methodology for identifying derived words and their stems in the Romanian Wordnet. To attain this aim we rely only on the list of literals in the wordnet and on a list of Romanian affixes; the automatically obtained pairs require automatic and manual validation, based on a ...

Publication date: 1997